Molecular differences between species of the M. tuberculosis complex

ABSTRACT

Specific genetic deletions are identified in mycobacteria isolates, including variations in the M. tuberculosis genome sequence between isolates, and numerous deletions present in BCG as compared to M. tb. These deletions are used as markers to distinguish between pathogenic and avirulent strains, and as markers for particular M. tb. isolates. Deletions specific to vaccine strains of BCG are useful in determining whether a positive tuberculin skin test is indicative of actual tuberculosis infection. The deleted sequences may be reintroduced into BCG to improve the efficacy of vaccination. Alternatively, the genetic sequence that corresponds to the deletion(s) is deleted from M. bovis or M. tuberculosis to attenuate the pathogenic bacteria.

This invention was made with Government support under contracts A101137 and A135969 awarded by the National Institutes of Health. The Government has certain rights in this invention.

Tuberculosis is an ancient human scourge that continues to be an important public health problem worldwide. It is an ongoing epidemic of staggering proportions. Approximately one in every three people in the world is infected with Mycobacterium tuberculosis, and has a 10% lifetime risk of progressing from infection to clinical disease. Although tuberculosis can be treated, an estimated 2.9 million people died from the disease last year.

There are significant problems with a reliance on drug treatment to control active M. tuberculosis infections. Most of the regions having high infection rates are less developed countries, which suffer from a lack of easily accessible health services, diagnostic facilities and suitable antibiotics against M. tuberculosis. Even where these are available, patient compliance is often poor because of the lengthy regimen required for complete treatment, and multidrug-resistant strains are increasingly common.

Prevention of infection would circumvent the problems of treatment, and so vaccination against tuberculosis is widely performed in endemic regions. Around 100 million people a year are vaccinated with live bacillus Calmette-Guerin (BCG) vaccine. BCG has the great advantage of being inexpensive and easily administered under less than optimal circumstances, with few adverse reactions. Unfortunately, the vaccine is widely variable in its efficacy, providing anywhere from 0 to 80% protection against infection with M. tuberculosis.

BCG has an interesting history. It is an attenuated strain of M. bovis, a very close relative of M. tuberculosis. The M. bovis strain that became BCG was isolated from a cow in the late 1800s by a bacteriologist named Nocard, hence it was called Nocard's bacillus. The attenuation of Nocard's bacillus took place from 1908 to 1921, over the course of 230 in vitro passages. Thereafter, it was widely grown throughout the world, resulting in additional hundreds and sometimes thousands of in vitro passages. Throughout its many years in the laboratory, there has been selection for cross-reaction with the tuberculin skin test, and for decreased side effects. The net result has been a substantially weakened pathogen, which may be ineffective in raising an adequate immune response.

New antituberculosis vaccines are urgently needed for the general population in endemic regions, for HIV-infected individuals, as well as health care professionals likely to be exposed to tubercle bacilli. Recombinant DNA vaccines bearing protective genes from virulent M. tuberculosis are being developed using shuttle plasmids to transfer genetic material from one mycobacterial species to another; for example, see U.S. Pat. No. 5,776,465. Tuberculosis vaccine development should be given a high priority in current medical research goals.

RELEVANT LITERATURE

Mahairas et al. (1996) J Bacteriol 178(5):1274-1282 provides a molecular analysis of genetic differences between Mycobacterium bovis BCG and virulent M. bovis. Subtractive genomic hybridization was used to identify genetic differences between virulent M. bovis and M. tuberculosis and avirulent BCG. U.S. Pat. No. 5,700,683 is directed to these genetic differences.

Cole et al. (1998) Nature 393:537-544 have described the complete genome of M. tuberculosis. To obtain the contiguous genome sequence, a combined approach was used that involved the systematic sequence analysis of selected large-insert clones as well as random small-insert clones from a whole-genome shotgun library. This culminated in a composite sequence of 4,411,529 base pairs, with a G+C content of 65.6%. 3,924 open reading frames were identified in the genome, accounting for ~91% of the potential coding capacity.

Mycobacterium tuberculosis (M. tb.) genomic sequence is available at several internet sites.

SUMMARY OF THE INVENTION

Genetic markers are provided that distinguish between strains of the Mycobacterium tuberculosis complex, particularly between avirulent and virulent strains. Strains of interest include M. bovis, M. bovis BCG strains, M. tuberculosis (M. tb.) isolates, and bacteriophages that infect mycobacteria. The genetic markers are used for assays, e.g. immunoassays, that distinguish between strains, such as to differentiate between BCG immunization and M. tb. infection. The protein products may be produced and used as an immunogen, in drug screening, etc. The markers are useful in constructing genetically modified M. tb. or M. bovis cells having improved vaccine characteristics.

DETAILED DESCRIPTION OF THE EMBODIMENTS

Specific genetic deletions are identified that serve as markers to distinguish between avirulent and virulent mycobacteria strains, including M. bovis, M. bovis BCG strains, M. tuberculosis (M. tb.) isolates, and bacteriophages that infect mycobacteria. These deletions are used as genetic markers to distinguish between the different mycobacteria. The deletions may be introduced into M. tb. or M. bovis by recombinant methods in order to render a pathogenic strain avirulent. Alternatively, the deleted genes are identified in the M. tb. genome sequence, and are then reintroduced by recombinant methods into BCG or other vaccine strains, in order to improve the efficacy of vaccination.

The deletions of the invention are identified by comparative hybridization of mycobacterial genomic DNA to a DNA microarray comprising sequences representative of the M. tb. coding sequences. The deletions are then mapped to the known M. tb. genome sequence in order to specifically identify the deleted gene(s), and to characterize the nucleotide sequence of the deleted region.

Nucleic acids comprising the provided deletions and junctions are used in a variety of applications. Hybridization probes may be obtained from the known M. tb. sequence which correspond to the deleted sequences. Such probes are useful in distinguishing between mycobacteria. For example, there is a 10% probability that an M. tb. infected person will progress to clinical disease, but that probability may vary depending on the particular infecting strain. Analysis for the presence or absence of the deletions provided below as “M. tb. variable” is used to distinguish between different M. tb. strains. The deletions are also useful in identifying whether a patient that is positive for a tuberculin skin test has been infected with M. tb. or with BCG.

In another embodiment of the invention, mycobacteria are genetically altered to delete sequences identified herein as absent in attenuated strains, but present in pathogenic strains, e.g. deletions found in BCG but present in M. tb. H37Rv. Such genetically engineered strains may provide superior vaccines to the present BCG isolates in use. Alternatively, BCG strains may be “reconstructed” to more closely resemble wild-type M. tb. by inserting certain of the deleted sequences back into the genome. Since the protein products of the deleted sequences are expressed in virulent mycobacterial species, the encoded proteins are useful as immunogens for vaccination.

The attenuation (loss of virulence) in BCG is attributed to the loss of genetic material at a number of places throughout the genome. The selection over time for fewer side-effects resulting from BCG immunization, while retaining cross-reactivity with the tuberculin skin test, has provided an excellent screen for those sequences that engender side effects. The identification of deletions that vary between BCG isolates identifies such sequences, which may be used in drug screening and biological analysis for the role of the deleted genes in causing untoward side effects and pathogenicity.

Identification of M. tuberculosis Complex Deletion Markers

The present invention provides nucleic acid sequences that are markers for specific mycobacteria, including M. tb., M. bovis, BCG and bacteriophage. The deletions are listed in Table 1. The absence or presence of these marker sequences is characteristic of the indicated isolate, or strain. As such, they provide a unique characteristic for the identification of the indicated mycobacteria. The deletions are identified by their M. tb. open reading frame (“Rv” nomenclature), which corresponds to a known genetic sequence, and may be accessed as previously cited. The junctions of the deletions are provided by the designation of position in the publicly available M. tb. sequence.

TABLE 1

SEQ ID          rd       rv_num   orf_id         breakpoint
SEQ ID NO: 1    RD01     Rv3871   MTV027.06      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 2    RD01     Rv3872   MTV027.07      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 3    RD01     Rv3873   MTV027.08      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 4    RD01     Rv3874   MTV027.09      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 5    RD01     Rv3875   MTV027.10      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 6    RD01     Rv3876   MTV027.11      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 7    RD01     Rv3877   MTV027.12      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 8    RD01     Rv3878   MTV027.13      “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 9    RD01     Rv3879c  MTV027.14c     “H37Rv, segment 160: 7534, 16989”
SEQ ID NO: 10   RD02     Rv1988   MTCY39.31c     “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 11   RD02     Rv1987   MTCY39.32c     “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 12   RD02     Rv1986   MTCY39.33c     “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 13   RD02     Rv1985c  MTCY39.34      “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 14   RD02     Rv1984c  MTCY39.35      “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 15   RD02     Rv1983   MTCY39.36c     “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 16   RD02     Rv1982c  MTCY39.37      “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 17   RD02     Rv1981c  MTCY39.38      “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 18   RD02     Rv1980c  MTCY39.39      “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 19   RD02     Rv1979c  MTCY39.40      “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 20   RD02     Rv1978   MTV051.16      “H37Rv segment 88: 14211, segment 89: 8598”
SEQ ID NO: 21   RD03     Rv1586c  MTCY336.18     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 22   RD03     Rv1585c  MTCY336.19     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 23   RD03     Rv1584c  MTCY336.20     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 24   RD03     Rv1583c  MTCY336.21     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 25   RD03     Rv1582c  MTCY336.22     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 26   RD03     Rv1581c  MTCY336.23     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 27   RD03     Rv1580c  MTCY336.24     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 28   RD03     Rv1579c  MTCY336.25     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 29   RD03     Rv1578c  MTCY336.26     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 30   RD03     Rv1577c  MTCY336.27     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 31   RD03     Rv1576c  MTCY336.28     “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 32   RD03     Rv1575   MTCY336.29c    “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 33   RD03     Rv1574   MTCY336.30c    “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 34   RD03     Rv1573   MTCY336.31c    “H37Rv, segment 70: 7677, 16923”
SEQ ID NO: 35   RD04     Rv0221   MTCY08D5.16    “H37Rv, segment 12: 17432, 19335”
SEQ ID NO: 36   RD04     Rv0222   MTCY08D5.17    “H37Rv, segment 12: 17432, 19335”
SEQ ID NO: 37   RD04     Rv0223c  MTCY08D5.18    “H37Rv, segment 12: 17432, 19335”
SEQ ID NO: 38   RD05     Rv3117   MTCY164.27     “H37Rv, segment 135: 27437, 30212”
SEQ ID NO: 39   RD05     Rv3118   MTCY164.28     “H37Rv, segment 135: 27437, 30212”
SEQ ID NO: 40   RD05     Rv3119   MTCY164.29     “H37Rv, segment 135: 27437, 30212”
SEQ ID NO: 41   RD05     Rv3120   MTCY164.30     “H37Rv, segment 135: 27437, 30212”
SEQ ID NO: 42   RD05     Rv3121   MTCY164.31     “H37Rv, segment 135: 27437, 30212”
SEQ ID NO: 43   RD06     Rv1506c  MTCY277.28c    “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 44   RD06     Rv1507c  MTCY277.29c    “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 45   RD06     Rv1508c  MTCY277.30c    “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 46   RD06     Rv1509   MTCY277.31     “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 47   RD06     Rv1510   MTCY277.32     “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 48   RD06     Rv1511   MTCY277.33     “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 49   RD06     Rv1512   MTCY277.34     “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 50   RD06     Rv1513   MTCY277.35     “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 51   RD06     Rv1514c  MTCY277.36c    “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 52   RD06     Rv1515c  MTCY277.37c    “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 53   RD06     Rv1516c  MTCY277.38c    “H37Rv, segment 65: 23614, 36347”
SEQ ID NO: 54   RD07     Rv2346c  MTCY98.15c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 55   RD07     Rv2347c  MTCY98.16c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 56   RD07     Rv2348c  MTCY98.17c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 57   RD07     Rv2349c  MTCY98.18c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 58   RD07     Rv2350c  MTCY98.19c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 59   RD07     Rv2351c  MTCY98.20c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 60   RD07     Rv2352c  MTCY98.21c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 61   RD07     Rv2353c  MTCY98.22c     “H37Rv, segment 103: 17622, 26584”
SEQ ID NO: 62   RD08     Rv0309   MTCY63.14      “H37Rv, segment 16: 17018, 20446”
SEQ ID NO: 63   RD08     Rv0310c  MTCY63.15c     “H37Rv, segment 16: 17018, 20446”
SEQ ID NO: 64   RD08     Rv0311   MTCY63.16      “H37Rv, segment 16: 17018, 20446”
SEQ ID NO: 65   RD08     Rv0312   MTCY63.17      “H37Rv, segment 16: 17018, 20446”
SEQ ID NO: 66   RD09     Rv3623   MTCY15C10.29c  “H37Rv, segment 153: 21131, segment 154: 2832”
SEQ ID NO: 67   RD09     Rv3622c  MTCY15C10.30   “H37Rv, segment 153: 21131, segment 154: 2832”
SEQ ID NO: 68   RD09     Rv3621c  MTCY15C10.31   “H37Rv, segment 153: 21131, segment 154: 2832”
SEQ ID NO: 69   RD09     Rv3620c  MTCY15C10.32   “H37Rv, segment 153: 21131, segment 154: 2832”
SEQ ID NO: 70   RD09     Rv3619c  MTCY15C10.33   “H37Rv, segment 153: 21131, segment 154: 2832”
SEQ ID NO: 71   RD09     Rv3618   MTCY15C10.34c  “H37Rv, segment 153: 21131, segment 154: 2832”
SEQ ID NO: 72   RD09     Rv3617   MTCY15C10.35c  “H37Rv, segment 153: 21131, segment 154: 2832”
SEQ ID NO: 73   RD10     Rv1257c  MTCY50.25      “H37Rv segment 55: 3689, 6696”
SEQ ID NO: 74   RD10     Rv1256c  MTCY50.26      “H37Rv segment 55: 3689, 6696”
SEQ ID NO: 75   RD10     Rv1255c  MTCY50.27      “H37Rv segment 55: 3689, 6696”
SEQ ID NO: 76   RD11     Rv3429   MTCY77.01      “H37Rv, segment 145: 30303 to segment 146: 1475”
SEQ ID NO: 77   RD11     Rv3428c  MTCY78.01      “H37Rv, segment 145: 30303 to segment 146: 1475”
SEQ ID NO: 78   RD11     Rv3427c  MTCY78.02      “H37Rv, segment 145: 30303 to segment 146: 1475”
SEQ ID NO: 79   RD11     Rv3426   MTCY78.03c     “H37Rv, segment 145: 30303 to segment 146: 1475”
SEQ ID NO: 80   RD11     Rv3425   MTCY78.04c     “H37Rv, segment 145: 30303 to segment 146: 1475”
SEQ ID NO: 81   RD12     Rv2072c  MTCY49.11c     “H37Rv segment 93: 9301, 11331”
SEQ ID NO: 82   RD12     Rv2073c  MTCY49.12c     “H37Rv segment 93: 9301, 11331”
SEQ ID NO: 83   RD12     Rv2074   MTCY49.13      “H37Rv segment 93: 9301, 11331”
SEQ ID NO: 84   RD12     Rv2075c  MTCY49.14c     “H37Rv segment 93: 9301, 11331”
SEQ ID NO: 85   RD13bis  Rv2645   MTCY441.15     “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 86   RD13bis  Rv2646   MTCY441.16     “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 87   RD13bis  Rv2647   MTCY441.17     “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 88   RD13bis  Rv2648   MTCY441.17A    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 89   RD13bis  Rv2649   MTCY441.18     “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 90   RD13bis  Rv2650c  MTCY441.19     “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 91   RD13bis  Rv2651c  MTCY441.20c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 92   RD13bis  Rv2652c  MTCY441.21c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 93   RD13bis  Rv2653c  MTCY441.22c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 94   RD13bis  Rv2654c  MTCY441.23c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 95   RD13bis  Rv2655c  MTCY441.24c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 96   RD13bis  Rv2656c  MTCY441.25c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 97   RD13bis  Rv2657c  MTCY441.26c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 98   RD13bis  Rv2658c  MTCY441.27c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 99   RD13bis  Rv2659c  MTCY441.28c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 100  RD13bis  Rv2660c  MTCY441.29c    “H37Rv, segment 118: 12475, 23455”
SEQ ID NO: 101  RD14     Rv1766   MTCY28.32      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 102  RD14     Rv1767   MTCY28.33      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 103  RD14     Rv1768   MTCY28.34      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 104  RD14     Rv1769   MTCY28.35      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 105  RD14     Rv1770   MTCY28.36      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 106  RD14     Rv1771   MTCY28.37      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 107  RD14     Rv1772   MTCY28.38      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 108  RD14     Rv1773c  MTCY28.39      “H37Rv segment 79: 30573, 39642”
SEQ ID NO: 109  RD15     Rv1963c  MTV051.01c     “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 110  RD15     Rv1964   MTV051.02      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 111  RD15     Rv1965   MTV051.03      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 112  RD15     Rv1966   MTV051.04      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 113  RD15     Rv1967   MTV051.05      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 114  RD15     Rv1968   MTV051.06      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 115  RD15     Rv1969   MTV051.07      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 116  RD15     Rv1970   MTV051.08      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 117  RD15     Rv1971   MTV051.09      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 118  RD15     Rv1972   MTV051.10      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 119  RD15     Rv1973   MTV051.11      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 120  RD15     Rv1974   MTV051.12      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 121  RD15     Rv1975   MTV051.13      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 122  RD15     Rv1976c  MTV051.14      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 123  RD15     Rv1977   MTV051.15      “H37Rv segment 88: 1153, 13873”
SEQ ID NO: 124  RD16     Rv3405c  MTCY78.23      “H37Rv, segment 145: 5012, 12621”
SEQ ID NO: 125  RD16     Rv3404c  MTCY78.24      “H37Rv, segment 145: 5012, 12621”
SEQ ID NO: 126  RD16     Rv3403c  MTCY78.25      “H37Rv, segment 145: 5012, 12621”
SEQ ID NO: 127  RD16     Rv3402c  MTCY78.26      “H37Rv, segment 145: 5012, 12621”
SEQ ID NO: 128  RD16     Rv3401   MTCY78.27c     “H37Rv, segment 145: 5012, 12621”
SEQ ID NO: 129  RD16     Rv3400   MTCY78.28c     “H37Rv, segment 145: 5012, 12621”

The “Rv” column indicates public M. tb. sequence, open reading frame. The BCG strains were obtained as follows:

TABLE 2 Strains employed in study of BCG phylogeny

Name of strain               Synonym       Source  Descriptors
BCG-Russia                   Moscow        ATCC    # 35740
BCG-Moreau                   Brazil        ATCC    # 35736
BCG-Moreau                   Brazil        IAF     dated 1958
BCG-Moreau                   Brazil        IAF     dated 1961
BCG-Japan                    Tokyo         ATCC    # 35737
BCG-Japan                    Tokyo         IAF     dated 1961
BCG-Japan                    Tokyo         JATA    vaccine strain
BCG-Japan                    Tokyo         JATA    bladder cancer strain
BCG-Japan                    Tokyo         JATA    clinical isolate - adenitis
BCG-Sweden                   Gothenburg    ATCC    # 35732
BCG-Sweden                   Gothenburg    IAF     dated 1958
BCG-Sweden                   Gothenburg    SSI     production lot, Copenhagen
BCG-Phipps                   Philadelphia  ATCC    # 35744
BCG-Denmark                  Danish 1331   ATCC    # 35733
BCG-Copenhagen               -             ATCC    # 27290
BCG-Copenhagen               -             IAF     dated 1961
BCG-Tice                     Chicago       -       vaccine dated 1973
BCG-Tice                     Chicago       ATCC    # 35743
BCG-Frappier                 Montreal      IAF     primary lot, 1973
BCG-Frappier, INH-resistant  Montreal-R    IAF     primary lot, 1973
BCG-Frappier                 Montreal      IAF     passage 946
BCG-Connaught                Toronto       CL      bladder cancer treatment
BCG-Birkhaug                 -             ATCC    # 35731
BCG-Prague                   Czech         SSI     lyophilized 1968
BCG-Glaxo                    -             -       vaccine dated 1973
BCG-Glaxo                    -             ATCC    # 35741
BCG-Pasteur                  -             IAF     passage 888
BCG-Pasteur                  -             IAF     dated 1961
BCG-Pasteur                  -             IP      1173P2-B
BCG-Pasteur                  -             IP      1173P2-C
BCG-Pasteur                  -             IP      clinical isolate # 1
BCG-Pasteur                  -             IP      clinical isolate # 2
BCG-Pasteur                  -             ATCC    # 35734

Abbreviations: IP = Institut Pasteur, Paris, France; IAF = Institut Armand Frappier, Laval, Canada; ATCC = American Type Culture Collection, Rockville, Md, USA; SSI = Statens Serum Institute, Copenhagen, Denmark; CL = Connaught Laboratories, Willowdale, Canada; JATA = Japanese Anti-Tuberculosis Association; INH = isoniazid. Canadian BCGs refers to BCG-Montreal and BCG-Toronto, the latter being derived from the former.

In performing the initial screening method, genomic DNA is isolated from two mycobacteria microbial cell cultures. The two DNA preparations are labeled, where a different label is used for the first and second microbial cultures, typically using nucleotides conjugated to a fluorochrome that emits at a wavelength substantially different from that of the fluorochrome tagged nucleotides used to label the selected probe. The strains used were the reference strain of Mycobacterium tuberculosis (H37Rv), other M. tb. laboratory strains, such as H37Ra, the O strain, M. tb. clinical isolates, the reference strain of Mycobacterium bovis, and different strains of Mycobacterium bovis BCG.

The two DNA preparations are mixed, and competitive hybridization is carried out to a microarray representing all of the open reading frames in the genome of the test microbe, usually H37Rv. Hybridization of the labeled sequences is accomplished according to methods well known in the art. In a preferred embodiment, the two probes are combined to provide for a competitive hybridization to a single microarray. Hybridization can be carried out under conditions varying in stringency, preferably under conditions of high stringency (e.g., 4×SSC, 10% SDS, 65° C.) to allow for hybridization of complementary sequences having extensive homology (e.g., having at least 85% sequence identity, preferably at least 90% sequence identity, more preferably having at least 95% sequence identity). Where the target sequences are native sequences, the hybridization is preferably carried out under conditions that allow hybridization of only highly homologous sequences (e.g., at least 95% to 100% sequence identity).

Two color fluorescent hybridization is utilized to assay the representation of the unselected library in relation to the selected library (i.e., to detect hybridization of the unselected probe relative to the selected probe). From the ratio of one color to the other, for any particular array element, the relative abundance of that sequence in the unselected and selected libraries can be determined. In addition, comparison of the hybridization of the selected and unselected probes provides an internal control for the assay. An absence of signal from the reference strain, as compared to H37Rv, is indicative that the open reading frame is deleted in the test strain. The deletion may be further mapped by Southern blot analysis, and by sequencing the regions flanking the deletion.

Microarrays can be scanned to detect hybridization of the selected and the unselected sequences using a custom built scanning laser microscope as described in Shalon et al., Genome Res. 6:639 (1996). A separate scan, using the appropriate excitation line, is performed for each of the two fluorophores used. The digital images generated from the scans are then combined for subsequent analysis. For any particular array element, the fluorescent signal from the amplified selected cell population DNA is compared to the fluorescent signal from the unselected cell population DNA, and the relative abundance of that sequence in the selected and unselected library is determined.
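By way of illustration only, the ratio comparison described above may be sketched as follows. This is a minimal sketch and not the analysis actually performed: the function name, the log2 cutoff of -1.0, the intensity floor, and all intensity values are assumptions introduced for the example.

```python
import math


def flag_deleted_orfs(reference_signal, test_signal, log2_cutoff=-1.0, floor=1.0):
    """Return the ORF ids whose test/reference log2 ratio is at or below the cutoff.

    reference_signal and test_signal map an ORF id to the fluorescence
    intensity measured for that array element in each channel.
    The cutoff and floor are illustrative assumptions, not values from the text.
    """
    flagged = []
    for orf_id, ref in reference_signal.items():
        test = test_signal.get(orf_id, 0.0)
        # Clamp both channels to a small floor so the ratio is always defined.
        ratio = max(test, floor) / max(ref, floor)
        if math.log2(ratio) <= log2_cutoff:
            flagged.append(orf_id)
    return sorted(flagged)


# Hypothetical intensities for three array elements (not measured data).
reference = {"Rv3874": 1200.0, "Rv3875": 1500.0, "Rv0001": 900.0}
test = {"Rv3874": 40.0, "Rv3875": 35.0, "Rv0001": 950.0}

print(flag_deleted_orfs(reference, test))  # ['Rv3874', 'Rv3875']
```

An element flagged in this way would be only a candidate deletion, to be confirmed by Southern blot analysis and sequencing of the flanking regions.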

Nucleic Acid Compositions

As used herein, the term “deletion marker”, or “marker” is used to refer to those sequences of M. tuberculosis complex genomes that are deleted in one or more of the strains or species, as indicated in Table 1. The bacteria of the M. tuberculosis complex include M. tuberculosis, M. bovis, and BCG, inclusive of varied isolates and strains within each species. Nucleic acids of interest include all or a portion of the deleted region, particularly complete open reading frames, hybridization primers, promoter regions, etc.

The term “junction” or “deletion junction” is used to refer to nucleic acids that comprise the regions on both the 3′ and the 5′ sequence immediately flanking the deletion. Such junction sequences are preferably used as short primers, e.g. from about 15 nt to about 30 nt, that specifically hybridize to the junction, but not to a nucleic acid comprising the undeleted genomic sequence. For example, the deletion found in M. bovis, at Rv0221, corresponds to the nucleotide sequence of the M. tuberculosis H37Rv genome, segment 12: 17432, 19335. The junction comprises the regions upstream of position 17432, and downstream of 19335, e.g. a nucleic acid of about 20 nucleotides comprising the sequence from H37Rv 17422-17432 joined to 19335-19345.

Typically, such nucleic acids comprising a junction will include at least about 7 nucleotides from each flanking region, i.e. from the 3′ and from the 5′ sequences adjacent to the deletion, and may be about 10 nucleotides from each flanking region, up to about 15 nucleotides, or more. Amplification primers that hybridize to the junction sequence, to the deleted sequence, and to the flanking non-deleted regions have a variety of uses, as detailed below.
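The construction of a junction sequence from the two flanking regions may be sketched as follows. This is an illustrative sketch only: the function name, the coordinate convention (1-based, inclusive, with the deleted bases themselves excluded) and the toy sequence are assumptions, and no H37Rv sequence is reproduced.

```python
def junction_oligo(genome, deletion_start, deletion_end, flank=10):
    """Join `flank` bases upstream of a deletion to `flank` bases downstream.

    Coordinates are 1-based and inclusive: bases deletion_start..deletion_end
    are the deleted region and do not appear in the returned sequence.
    """
    if deletion_start > deletion_end:
        raise ValueError("deletion_start must not exceed deletion_end")
    if flank < 1 or deletion_start - 1 < flank or deletion_end + flank > len(genome):
        raise ValueError("flank does not fit within the genome sequence")
    upstream = genome[deletion_start - 1 - flank:deletion_start - 1]
    downstream = genome[deletion_end:deletion_end + flank]
    return upstream + downstream


# Toy 28-base sequence in which bases 11-18 (the G run) stand in for a deletion.
toy_genome = "AAAAACCCCC" + "GGGGGGGG" + "TTTTTAAAAA"
oligo = junction_oligo(toy_genome, 11, 18, flank=5)

print(oligo)                # CCCCCTTTTT
print(oligo in toy_genome)  # False: the junction is absent from the undeleted sequence
```

The second check mirrors the requirement that a junction sequence hybridize to the junction but not to the undeleted genomic sequence; an exact substring test is of course a much cruder criterion than hybridization specificity.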

The nucleic acid compositions of the subject invention encode all or a part of the deletion markers. Fragments may be obtained of the DNA sequence by chemically synthesizing oligonucleotides in accordance with conventional methods, by restriction enzyme digestion, by PCR amplification, etc. For the most part, DNA fragments will be at least about 25 nt in length, usually at least about 30 nt, more usually at least about 50 nt. For use in amplification reactions, such as PCR, a pair of primers will be used. The exact composition of the primer sequences is not critical to the invention, but for most applications the primers will hybridize to the subject sequence under stringent conditions, as known in the art. It is preferable to choose a pair of primers that will generate an amplification product of at least about 50 nt, preferably at least about 100 nt. Algorithms for the selection of primer sequences are generally known, and are available in commercial software packages. Amplification primers hybridize to complementary strands of DNA, and will prime towards each other.
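The geometry of a primer pair that primes towards each other, and the resulting product length, may be illustrated by the following sketch. It assumes exact-match binding on an unambiguous A/C/G/T template, which is a simplification of hybridization under stringent conditions; the function names and the toy template are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product primed by two primers, both written 5' to 3'.

    The forward primer must match the given strand exactly, and the reverse
    primer must match the opposite strand downstream of the forward primer.
    Returns None if either primer has no exact binding site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    site = reverse_complement(reverse)  # reverse-primer site on the given strand
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return end + len(site) - start


# Toy 60-base template (not a mycobacterial sequence).
toy_template = "GATTACAGGC" + "A" * 40 + "CCTGAGTCAA"
length = amplicon_length(toy_template, "GATTACAGGC", "TTGACTCAGG")

print(length)        # 60
print(length >= 50)  # True: meets the "at least about 50 nt" preference
```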

Usually, the DNA will be obtained substantially free of other nucleic acid sequences that do not include a deletion marker sequence or fragment thereof, generally being at least about 50%, usually at least about 90% pure, and is typically “recombinant”, i.e. flanked by one or more nucleotides with which it is not normally associated on a naturally occurring chromosome.

For screening purposes, hybridization probes of one or more of the deletion sequences may be used in separate reactions or spatially separated on a solid phase matrix, or labeled such that they can be distinguished from each other. Assays may utilize nucleic acids that hybridize to one or more of the described deletions.

An array may include all or a subset of the deletion markers listed in Table 1. Usually such an array will include at least 2 different deletion marker sequences, i.e. deletions located at unique positions within the locus, and may include all of the provided deletion markers. Arrays of interest may further comprise other genetic sequences, particularly other sequences of interest for tuberculosis screening. The oligonucleotide sequence on the array will usually be at least about 12 nt in length, may be the length of the provided deletion marker sequences, or may extend into the flanking regions to generate fragments of 100 to 200 nt in length. For examples of arrays, see Ramsay (1998) Nat. Biotech. 16:40-44; Hacia et al. (1996) Nature Genetics 14:441-447; Lockhart et al. (1996) Nature Biotechnol. 14:1675-1680; and De Risi et al. (1996) Nature Genetics 14:457-460.

Nucleic acids may be naturally occurring, e.g. DNA or RNA, or may be synthetic analogs, as known in the art. Such analogs may be preferred for use as probes because of superior stability under assay conditions. Modifications in the native structure, including alterations in the backbone, sugars or heterocyclic bases, have been shown to increase intracellular stability and binding affinity. Among useful changes in the backbone chemistry are phosphorothioates; phosphorodithioates, where both of the non-bridging oxygens are substituted with sulfur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral phosphate derivatives include 3′-O′-5′-S-phosphorothioate, 3′-S-5′-O-phosphorothioate, 3′-CH₂-5′-O-phosphonate and 3′-NH-5′-O-phosphoroamidate. Peptide nucleic acids replace the entire ribose phosphodiester backbone with a peptide linkage.

Sugar modifications are also used to enhance stability and affinity. The α-anomer of deoxyribose may be used, where the base is inverted with respect to the natural β-anomer. The 2′-OH of the ribose sugar may be altered to form 2′-O-methyl or 2′-O-allyl sugars, which provide resistance to degradation without compromising affinity.

Modification of the heterocyclic bases must maintain proper base pairing. Some useful substitutions include deoxyuridine for deoxythymidine; 5-methyl-2′-deoxycytidine and 5-bromo-2′-deoxycytidine for deoxycytidine. 5-propynyl-2′-deoxyuridine and 5-propynyl-2′-deoxycytidine have been shown to increase affinity and biological activity when substituted for deoxythymidine and deoxycytidine, respectively.

Polypeptide Compositions

The specific deletion markers in Table 1 correspond to open reading frames of the M. tb. genome, and therefore encode a polypeptide. The subject markers may be employed for synthesis of a complete protein, or polypeptide fragments thereof, particularly fragments corresponding to functional domains; binding sites; etc.; and including fusions of the subject polypeptides to other proteins or parts thereof. For expression, an expression cassette may be employed, providing for a transcriptional and translational initiation region, which may be inducible or constitutive, where the coding region is operably linked under the transcriptional control of the transcriptional initiation region, and a transcriptional and translational termination region. Various transcriptional initiation regions may be employed that are functional in the expression host.

In the present specification and claims, the term “polypeptide fragments”, or variants thereof, denotes both short peptides with a length of at least two amino acid residues and at most 10 amino acid residues, oligopeptides with a length of at least 11 amino acid residues, 20 amino acid residues, 50 amino acid residues, and up to about 100 amino acid residues; and longer peptides of greater than 100 amino acid residues up to the complete length of the native polypeptide.

The term “substantially pure polypeptide fragment” means a polypeptide preparation which contains at most 5% by weight of other polypeptide material with which it is natively associated, and lower percentages are preferred, e.g. at most 4%, at most 3%, at most 2%, at most 1%, and at most 0.5%. It is preferred that the substantially pure polypeptide is at least 96% pure, i.e. that the polypeptide constitutes at least 96% by weight of total polypeptide material present in the preparation, and higher percentages are preferred, such as at least 97%, at least 98%, at least 99%, at least 99.25%, at least 99.5%, and at least 99.75%. It is especially preferred that the polypeptide fragment is essentially free of any other antigen with which it is natively associated, i.e. free of any other antigen from bacteria belonging to the tuberculosis complex. This can be accomplished by preparing the polypeptide fragment by means of recombinant methods in a non-mycobacterial host, or by synthesizing the polypeptide fragment by the well-known methods of solid or liquid phase peptide synthesis, e.g. by the method described by Merrifield or variations thereof.

The M. tuberculosis polypeptide antigens provided herein include variants that are encoded by DNA sequences that are substantially homologous to one or more of the DNA sequences specifically recited herein, for example variants having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity.
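By way of illustration only, the sequence identity thresholds recited above can be checked with a few lines of code. The following is a minimal sketch, assuming the two DNA sequences have already been aligned to equal length with `-` as the gap character; the function names and the choice to count a gap opposite a base as a mismatch are assumptions of this sketch, and other definitions of percent identity are in common use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Columns where both sequences carry a gap are ignored; a gap opposite
    a base is counted as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when the pair reaches the chosen identity threshold (default 80%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `meets_threshold(seq1, seq2, threshold=95.0)` would test the 95% level; the alignment step itself is outside the scope of this sketch.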

In a preferred embodiment of the invention, polypeptide fragments provide for an epitope of the deletion marker. The binding site of antibodies typically utilizes multiple non-covalent interactions to achieve high affinity binding. While a few contact residues of the antigen may be brought into close proximity to the binding pocket, other parts of the antigen molecule can also be required for maintaining a conformation that permits binding. The portion of the antigen bound by the antibody is referred to as an epitope. As used herein, an epitope is that portion of the antigen that is sufficient for high affinity binding. In a polypeptide antigen, generally a linear epitope will be at least about 7 amino acids in length, and may be at least 8, at least 9, at least 10, at least 11, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, or at least 30 amino acid residues in length. However, antibodies may also recognize conformational determinants formed by non-contiguous residues on an antigen, and an epitope can therefore require a larger fragment of the antigen to be present for binding, e.g. a domain, or up to substantially all of a protein sequence. For each antigen there exists a plurality of epitopes that, in sum, represent the immunologic determinants of that antigen, although there are instances in which an antigen contains a single epitope.
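As an illustration of the fragment lengths discussed above, candidate linear fragments of a chosen length can be enumerated from a polypeptide sequence with a sliding window. The following is a minimal sketch; the function name and the example sequence are hypothetical and are not taken from Table 1, and the sketch says nothing about which fragments are actually bound by an antibody.

```python
def linear_fragments(protein: str, length: int = 7) -> list[str]:
    """Return every contiguous fragment of `length` residues, in order.

    Only enumerates candidates for a linear epitope; conformational
    determinants formed by non-contiguous residues are not modeled.
    """
    if length < 1:
        raise ValueError("length must be positive")
    protein = protein.upper()
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


# Hypothetical 10-residue sequence, used only to show the windowing.
example = linear_fragments("MKTAYIAKQR", length=7)
```

A 10-residue sequence yields four 7-residue fragments; a sequence shorter than the window yields none.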

The level of affinity of antibody binding that is considered to be “specific” will be determined in part by the class of antibody, e.g. antigen specific antibodies of the IgM class may have a lower affinity than antibodies of, for example, the IgG classes. As used herein, in order to consider an antibody interaction to be “specific”, the affinity will be at least about 10⁻⁷ M, usually about 10⁻⁸ to 10⁻⁹ M, and may be up to 10⁻¹¹ M or higher for the epitope of interest. It will be understood by those of skill in the art that the term “specificity” refers to such a high affinity binding, and is not intended to mean that the antibody cannot bind to other molecules as well. One may find cross-reactivity with different epitopes, due, e.g. to a relatedness of antigen sequence or structure, or to the structure of the antibody binding pocket itself. Antibodies demonstrating such cross-reactivity are still considered specific for the purposes of the present invention.

Polypeptide sequences include analogs and variants produced by recombinant methods wherein such nucleic acids and polypeptide sequences are modified by substitution, insertion, addition, and/or deletion of one or more nucleotides in the nucleic acid sequence to cause the substitution, insertion, addition, and/or deletion of one or more amino acid residues in the recombinant polypeptide.

The polypeptides may be expressed in prokaryotes or eukaryotes in accordance with conventional ways, depending upon the purpose for expression. For large scale production of the protein, a unicellular organism, such as E. coli, B. subtilis, S. cerevisiae, or cells of a higher organism such as vertebrates, particularly mammals, e.g. COS 7 cells, may be used as the expression host cells. Small peptides can also be synthesized in the laboratory.

With the availability of the polypeptides in large amounts, by employing an expression host, the polypeptides may be isolated and purified in accordance with conventional ways. A lysate may be prepared of the expression host and the lysate purified using HPLC, exclusion chromatography, gel electrophoresis, affinity chromatography, or other purification technique. The purified polypeptide will generally be at least about 80% pure, preferably at least about 90% pure, and may be up to and including 100% pure. Pure is intended to mean free of other proteins, as well as cellular debris.

The polypeptide is used for the production of antibodies, where short fragments provide for antibodies specific for the particular polypeptide, and larger fragments or the entire protein allow for the production of antibodies over the surface of the polypeptide. Antibodies may be raised to isolated peptides corresponding to particular domains, or to the native protein.

Antibodies are prepared in accordance with conventional ways, where the expressed polypeptide or protein is used as an immunogen, by itself or conjugated to known immunogenic carriers, e.g. KLH, pre-S HBsAg, other viral or eukaryotic proteins, or the like. Various adjuvants may be employed, with a series of injections, as appropriate. For monoclonal antibodies, after one or more booster injections, the spleen is isolated, the lymphocytes immortalized by cell fusion, and then screened for high affinity antibody binding. The immortalized cells, i.e. hybridomas, producing the desired antibodies may then be expanded. For further description, see Monoclonal Antibodies: A Laboratory Manual, Harlow and Lane eds., Cold Spring Harbor Laboratories, Cold Spring Harbor, N.Y., 1988. If desired, the mRNA encoding the heavy and light chains may be isolated and mutagenized by cloning in E. coli, and the heavy and light chains mixed to further enhance the affinity of the antibody. Alternatives to in vivo immunization as a method of raising antibodies include binding to phage “display” libraries, usually in conjunction with in vitro affinity maturation.

The antibody may be produced as a single chain, instead of the normal multimeric structure. Single chain antibodies are described in Jost et al. (1994) J.B.C. 269:26267-73, and others. DNA sequences encoding the variable region of the heavy chain and the variable region of the light chain are ligated to a spacer encoding at least about 4 amino acids of small neutral amino acids, including glycine and/or serine. The protein encoded by this fusion allows assembly of a functional variable region that retains the specificity and affinity of the original antibody.

Vaccines may be formulated according to methods known in the art. Vaccines of the polypeptides as described above or modified bacteria are administered to a host which may be exposed to virulent tuberculosis. In many countries where tuberculosis is endemic, vaccination may be performed at birth, with additional vaccinations as necessary. The compounds of the present invention are administered at a dosage that provides effective immunity while minimizing any side-effects. It is contemplated that the composition will be obtained and used under the guidance of a physician.

Conventional vaccine strains of BCG may be formulated in a combination vaccine with polypeptides identified in the present invention and produced as previously described, in order to improve the efficacy of the vaccine.

In one method, a dose of the deletion marker polypeptide, formulated as a cocktail of proteins or as individual protein species, in a suitable medium is injected into the patient. The dose will usually be at least about 0.05 μg of protein, and usually not more than about 5 μg of protein.

Various methods for administration may be employed. The formulation may be injected intramuscularly, intravascularly, subcutaneously, etc. The dosage will be conventional. The bacteria can be formulated into pharmaceutical compositions by combination with appropriate, pharmaceutically acceptable carriers or diluents, and may be formulated into preparations in semi-solid or liquid forms, such as solutions, injections, etc. The following methods and excipients are merely exemplary and are in no way limiting.

The polypeptide or modified bacteria can be formulated into preparations for injections by dissolving, suspending or emulsifying them in an aqueous or nonaqueous solvent, such as vegetable or other similar oils, synthetic aliphatic acid glycerides, esters of higher aliphatic acids or propylene glycol; and if desired, with conventional additives such as solubilizers, isotonic agents, suspending agents, emulsifying agents, stabilizers and preservatives. Unit dosage forms for injection or intravenous administration may comprise the bacteria or polypeptide of the present invention in a composition as a solution in sterile water, normal saline or another pharmaceutically acceptable carrier.

The term “unit dosage form,” as used herein, refers to physically discrete units suitable as unitary dosages for human and animal subjects, each unit containing a predetermined quantity of vaccine, calculated in an amount sufficient to produce the desired effect in association with a pharmaceutically acceptable diluent, carrier or vehicle. The specifications for the unit dosage forms of the present invention depend on the particular bacteria employed and the effect to be achieved, and the pharmacodynamics associated with each complex in the host.

The pharmaceutically acceptable excipients, such as vehicles, adjuvants, carriers or diluents, are readily available to the public. Moreover, pharmaceutically acceptable auxiliary substances, such as pH adjusting and buffering agents, tonicity adjusting agents, stabilizers, wetting agents and the like, are readily available to the public.

Mycobacteria, particularly those of the M. tuberculosis complex, are genetically engineered to contain specific deletions or insertions corresponding to the identified genetic markers. In particular, attenuated BCG strains are modified to introduce deleted genes encoding sequences important in the establishment of effective immunity. Alternatively, M. bovis or M. tuberculosis are modified by homologous recombination to create specific deletions in sequences that determine virulence, i.e. the bacteria are attenuated through recombinant techniques.

In order to stably introduce sequences into BCG, the M. tb. open reading frame corresponding to one of the deletions in Table 1 is inserted into a vector that is maintained in M. bovis strains. Preferably, the native 5′ and 3′ flanking sequences are included, in order to provide for suitable regulation of transcription and translation. However, in special circumstances, exogenous promoters and other regulatory regions may be included. Vectors and methods of transfection for BCG are known in the art. For example, U.S. Pat. No. 5,776,465, herein incorporated by reference, describes the introduction of exogenous genes into BCG.

In one embodiment of the invention, the complete deleted region is replaced in BCG. The junctions of the deletion are determined as compared to a wild type M. tb. or M. bovis sequence, for example as set forth in the experimental section. The deleted region is cloned by any convenient method, as known in the art, e.g. PCR amplification of the region, restriction endonuclease digestion, chemical synthesis, etc. Preferably the cloned region will further comprise flanking sequences of a length sufficient to induce homologous recombination, usually at least about 25 nt, more usually at least about 100 nt, or greater. Suitable vectors and methods are known in the art, for an example, see Norman et al. (1995) Mol. Microbiol. 16:755-760.

In an alternative embodiment, one or more of the deletions provided in Table 1 are introduced into a strain of M. tuberculosis or M. bovis. Preferably such a strain is reduced in virulence, e.g. H37Ra, etc. Methods of homologous recombination in order to effect deletions in mycobacteria are known in the art, for example, see Norman et al., supra; Ganjam et al. (1991) P.N.A.S. 88:5433-5437; and Aldovini et al. (1993) J. Bacteriol. 175:7282-7289. Deletions may comprise an open reading frame identified in Table 1, or may extend to the full deletion, i.e. extending into flanking regions, and may include multiple open reading frames.

The ability of the genetically altered mycobacterium to cause disease may be tested in one or more experimental models. For example, M. tb. is known to infect a variety of animals, and cells in culture. In one assay, mammalian macrophages, preferably human macrophages, are infected. In a comparison of virulent, avirulent and attenuated strains of the M. tuberculosis complex, alveolar or peripheral blood monocytes are infected at a 1:1 ratio (Silver et al. (1998) Infect Immun 66(3):1190-1199; Paul et al. (1996) J Infect Dis 174(1):105-112). The percentages of cells infected by the strains and the initial numbers of intracellular organisms are equivalent, as are levels of monocyte viability up to 7 days following infection. However, intracellular growth reflects virulence, over a period of one or more weeks. Mycobacterial growth may be evaluated by acid-fast staining, electron microscopy, and colony-forming units (cfu) assays. Monocyte production of tumor necrosis factor alpha may also be monitored as a marker for virulence.

Other assays for virulence utilize animal models. The M. tb. complex bacteria are able to infect a wide variety of animal hosts. One model of particular interest is cavitary tuberculosis produced in rabbits by aerosolized virulent tubercle bacilli (Converse et al. (1996) Infect Immun 64(11):4776-4787). In liquefied caseum, the tubercle bacilli grow extracellularly for the first time since the onset of the disease and can reach such large numbers that mutants with antimicrobial resistance may develop. From a cavity, the bacilli enter the bronchial tree and spread to other parts of the lung and also to other people. Of the commonly used laboratory animals, the rabbit is the only one in which cavitary tuberculosis can be readily produced.

Use of Deletion Markers in Identification of Mycobacteria

The deletions provided in Table 1 are useful for the identification of a mycobacterium as (a) a variant of M. tb., (b) an isolate of BCG, (c) an M. bovis strain, or (d) carrying the identified mycobacterial bacteriophage, depending on the specific marker that is chosen. Such screening is particularly useful in determining whether a particular infection or isolate is pathogenic. The term mycobacteria may refer to any member of the family Mycobacteriaceae, including M. tuberculosis, M. avium complex, M. kansasii, M. scrofulaceum, M. bovis and M. leprae.

Means of detecting deletions are known in the art. Deletions may be identified through the absence or presence of the sequences in mRNA or genomic DNA, through analysis of junctional regions that flank the deletion, or detection of the gene product, or, particularly relating to the tuberculin skin test, by identification of antibodies that react with the encoded gene product.

While deletions can be easily determined by the absence of hybridization, in many cases it is desirable to have a positive signal, in order to minimize artifactual negative readings. In such cases the deletions may be detected by designing a primer that spans the junction formed by the deletion. Where the deletion is present, a novel sequence is formed between the flanking regions, which can be detected by hybridization. Preferably such a primer will be sufficiently short that it will only hybridize to the junction, and will fail to form stable hybrids with either of the separate parts of the junction.

Diagnosis is performed by protein, DNA or RNA sequence and/or hybridization analysis of any convenient sample, e.g. cultured mycobacteria, biopsy material, blood sample, etc. Screening may also be based on the functional or antigenic characteristics of the protein. Immunoassays designed to detect the encoded proteins from deleted sequences may be used in screening.

A number of methods are available for analyzing nucleic acids for the presence of a specific sequence. Where large amounts of DNA are available, genomic DNA is used directly. Alternatively, the region of interest is cloned into a suitable vector and grown in sufficient quantity for analysis. The nucleic acid may be amplified by conventional techniques, such as the polymerase chain reaction (PCR), to provide sufficient amounts for analysis. The use of the polymerase chain reaction is described in Saiki, et al. (1985) Science 239:487, and a review of current techniques may be found in Sambrook, et al. Molecular Cloning: A Laboratory Manual, CSH Press 1989, pp. 14.2-14.33. Amplification may also be used to determine whether a polymorphism is present, by using a primer that is specific for the polymorphism. Alternatively, various methods are known in the art that utilize oligonucleotide ligation, for examples see Riley et al. (1990) N.A.R. 18:2887-2890; and Delahunty et al. (1996) Am. J. Hum. Genet. 58:1239-1246.

A detectable label may be included in an amplification reaction. Suitable labels include fluorochromes, e.g. fluorescein isothiocyanate (FITC), rhodamine, Texas Red, phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM), 2′,7′-dimethoxy-4′,5′-dichloro-6-carboxyfluorescein (JOE), 6-carboxy-X-rhodamine (ROX), 6-carboxy-2′,4′,7′,4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA); radioactive labels, e.g. ³²P, ³⁵S, ³H; etc. The label may be a two stage system, where the amplified DNA is conjugated to biotin, haptens, etc. having a high affinity binding partner, e.g. avidin, specific antibodies, etc., where the binding partner is conjugated to a detectable label. The label may be conjugated to one or both of the primers. Alternatively, the pool of nucleotides used in the amplification is labeled, so as to incorporate the label into the amplification product.

The sample nucleic acid, e.g. amplified or cloned fragment, is analyzed by one of a number of methods known in the art. The nucleic acid may be sequenced by dideoxy or other methods, and the sequence of bases compared to the deleted sequence. Hybridization with the variant sequence may also be used to determine its presence, by Southern blots, dot blots, etc. The hybridization pattern of a control and variant sequence to an array of oligonucleotide probes immobilized on a solid support, as described in U.S. Pat. No. 5,445,934, or in WO95/35505, may also be used as a means of detecting the presence of variable sequences. Single strand conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis (DGGE), mismatch cleavage detection, and heteroduplex analysis in gel matrices are used to detect conformational changes created by DNA sequence variation as alterations in electrophoretic mobility. Alternatively, where a polymorphism creates or destroys a recognition site for a restriction endonuclease (restriction fragment length polymorphism, RFLP), the sample is digested with that endonuclease, and the products size fractionated to determine whether the fragment was digested. Fractionation is performed by gel or capillary electrophoresis, particularly acrylamide or agarose gels.

The hybridization pattern of a control and variant sequence to an array of oligonucleotide probes immobilized on a solid support, as described in U.S. Pat. No. 5,445,934, or in WO95/35505, may be used as a means of detecting the presence or absence of deleted sequences. In one embodiment of the invention, an array of oligonucleotides is provided, where discrete positions on the array are complementary to at least a portion of M. tb. genomic DNA, usually comprising at least a portion from the identified open reading frames. Such an array may comprise a series of oligonucleotides, each of which can specifically hybridize to a nucleic acid, e.g. mRNA, cDNA, genomic DNA, etc.
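The comparison of control and test hybridization patterns can be illustrated in code. The following is a minimal sketch, assuming that each open reading frame on the array has been reduced to a single signal value for a reference strain and for the test isolate; the function name, the ratio cutoff of 0.25, the minimum reference signal of 100, and the open reading frame labels are all hypothetical values chosen for illustration, not parameters taken from this specification.

```python
def call_deletions(reference_signal: dict[str, float],
                   test_signal: dict[str, float],
                   ratio_cutoff: float = 0.25,
                   min_reference: float = 100.0) -> list[str]:
    """Return ORFs whose test/reference signal ratio falls below the cutoff.

    ORFs with a weak reference signal are skipped as uninformative, since
    absence of signal there cannot be attributed to a deletion.
    """
    deleted = []
    for orf, ref in reference_signal.items():
        if ref < min_reference:
            continue  # probe did not hybridize well even to the reference
        test = test_signal.get(orf, 0.0)  # a missing reading is treated as no signal
        if test / ref < ratio_cutoff:
            deleted.append(orf)
    return sorted(deleted)


# Hypothetical signal values for three illustrative ORF labels.
reference = {"orf_A": 1200.0, "orf_B": 950.0, "orf_C": 40.0}
isolate = {"orf_A": 1100.0, "orf_B": 60.0, "orf_C": 5.0}
calls = call_deletions(reference, isolate)
```

In this toy data only `orf_B` is called as deleted; `orf_C` is skipped because the reference signal is itself too low to interpret.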

Deletions may also be detected by amplification. In an embodiment of the invention, sequences are amplified that include a deletion junction, i.e. where the amplification primers hybridize to a junction sequence. In a nucleic acid sample where the marker sequence is deleted, a junction will be formed, and the primer will hybridize, thereby allowing amplification of a detectable sequence. In a nucleic acid sample where the marker sequence is present, the primer will not hybridize, and no amplification will take place. Alternatively, amplification primers may be chosen such that amplification of the target sequence will only take place where the marker sequence is present. The amplification products may be separated by size using any convenient method, as known in the art, including gel electrophoresis, chromatography, capillary electrophoresis, density gradient fractionation, etc.
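The logic of a junction-specific primer can be shown with a short in silico example. The following is a minimal sketch using toy sequences that are not M. tb. sequences; the function names and the arm length of 10 nt are assumptions, and exact string matching on either strand is used as a crude stand-in for stable hybridization, so melting temperature and partial matches are not modeled.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def junction_primer(left_flank: str, right_flank: str, arm: int = 10) -> str:
    """Oligonucleotide spanning the junction formed when the region between the flanks is deleted."""
    if len(left_flank) < arm or len(right_flank) < arm:
        raise ValueError("flanks must be at least `arm` nucleotides long")
    return left_flank[-arm:].upper() + right_flank[:arm].upper()


def primer_anneals(primer: str, template: str) -> bool:
    """Exact-match test on either strand of the template."""
    template = template.upper()
    return primer in template or reverse_complement(primer) in template


# Toy sequences for illustration only.
left = "GATTACAGGCTTAACCGGTA"
marker = "TTTTCCCCAAAAGGGGTTTTCCCC"   # region absent from the deletion strain
right = "CGATCGTTAGCAGTCAATGC"

wild_type = left + marker + right
deletion_strain = left + right
primer = junction_primer(left, right)
```

Here `primer_anneals(primer, deletion_strain)` is true and `primer_anneals(primer, wild_type)` is false, which mirrors the positive signal obtained only where the marker sequence is deleted.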

In addition to the detection of deletions by the detection of junction sequences, or detection of the marker sequences themselves, one may determine the presence or absence of the encoded protein product. The specific deletions in Table 1 correspond to open reading frames of the M. tb. genome, and therefore encode polypeptides. Polypeptides are detected by means known in the art, including determining the presence of the specific polypeptide in a sample through biochemical, functional or immunological characterization. The detection of antibodies in patient serum that react with a polypeptide is of particular interest.

Immunization with BCG typically leads to a positive response against tuberculin antigens in a skin test. In people who have been immunized, which includes a significant proportion of the world population, it is therefore difficult to determine whether a positive test is the result of an immune reaction to the BCG vaccine, or to an ongoing M. tb. infection. The subject invention has provided a number of open reading frame sequences that are present in M. tb. isolates, but are absent in BCG. As a primary or a secondary screening method, one may test for immunoreactivity of the patient with the polypeptides encoded by such deletion markers. Diagnosis may be performed by a number of methods. The different methods all determine the presence of an immune response to the polypeptide in a patient, where a positive response is indicative of an M. tb. infection. The immune response may be determined by determination of antibody binding, or by the presence of a response to intradermal challenge with the polypeptide.

In one method, a dose of the deletion marker polypeptide, formulated as a cocktail of proteins or as individual protein species, in a suitable medium is injected subcutaneously into the patient. The dose will usually be at least about 0.05 μg of protein, and usually not more than about 5 μg of protein. A control comprising medium alone, or an unrelated protein, will be injected nearby at the same time. The site of injection is examined after a period of time for the presence of a wheal. The wheal at the site of polypeptide injection is compared to that at the site of the control injection, usually by measuring the size of the wheal. The skin test readings may be assessed by a variety of objective grading systems. A positive result will show an increased diameter at the site of polypeptide injection as compared to the control, usually at least about 50% increase in size, more usually at least 100% increase in size.
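The comparison of the test wheal to the control wheal reduces to a simple percentage calculation. The following is a minimal sketch, assuming both wheals are recorded as diameters in millimeters and that the control wheal is measurable (non-zero); the function names and the returned labels are hypothetical, and an actual reading would follow whichever objective grading system is in use.

```python
def percent_increase(test_mm: float, control_mm: float) -> float:
    """Percent increase in wheal diameter at the test site over the control site."""
    if control_mm <= 0:
        raise ValueError("control wheal diameter must be positive for a ratio-based reading")
    return 100.0 * (test_mm - control_mm) / control_mm


def read_skin_test(test_mm: float, control_mm: float) -> str:
    """Apply the 50% and 100% increase levels described in the text above."""
    increase = percent_increase(test_mm, control_mm)
    if increase >= 100.0:
        return "positive (at least 100% increase)"
    if increase >= 50.0:
        return "positive (at least 50% increase)"
    return "negative"
```

For instance, a 9 mm test wheal against a 6 mm control is a 50% increase, and a 12 mm test wheal against the same control is a 100% increase.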

An alternative method for diagnosis depends on the in vitro detection of binding between antibodies in a patient sample and the subject polypeptides, either as a cocktail or as individual protein species, where the presence of specific binding is indicative of an infection. Measuring the concentration of polypeptide specific antibodies in a sample or fraction thereof may be accomplished by a variety of specific assays. In general, the assay will measure the reactivity between the subject polypeptides and a patient sample, usually blood derived, generally in the form of plasma or serum. The patient sample may be used directly, or diluted as appropriate, usually about 1:10 and usually not more than about 1:10,000. Immunoassays may be performed in any physiological buffer, e.g. PBS, normal saline, HBSS, dPBS, etc.

In a preferred embodiment, a conventional sandwich type assay is used. A sandwich assay is performed by first attaching the polypeptide to an insoluble surface or support. The polypeptide may be bound to the surface by any convenient means, depending upon the nature of the surface, either directly or through specific antibodies. The particular manner of binding is not crucial so long as it is compatible with the reagents and overall methods of the invention. The polypeptides may be bound to the plates covalently or non-covalently, preferably non-covalently. Samples, fractions or aliquots thereof are then added to separately assayable supports (for example, separate wells of a microtiter plate) containing support-bound polypeptide. Preferably, a series of standards, containing known concentrations of antibodies, is assayed in parallel with the samples or aliquots thereof to serve as controls.

Second receptors specific for the bound patient antibodies may be labeled to facilitate direct or indirect quantification of binding. Examples of labels which permit direct measurement of second receptor binding include radiolabels, such as ³H or ¹²⁵I, fluorescers, dyes, beads, chemiluminescers, colloidal particles, and the like. Examples of labels which permit indirect measurement of binding include enzymes where the substrate may provide for a colored or fluorescent product. In a preferred embodiment, the second receptors are antibodies labeled with a covalently bound enzyme capable of providing a detectable product signal after addition of suitable substrate. Examples of suitable enzymes for use in conjugates include horseradish peroxidase, alkaline phosphatase, malate dehydrogenase and the like. Where not commercially available, such antibody-enzyme conjugates are readily produced by techniques known to those skilled in the art.

In some cases, a competitive assay will be used. In addition to the patient sample, a competitor to the antibody is added to the reaction mix. The competitor and the antibody compete for binding to the polypeptide. Usually, the competitor molecule will be labeled and detected as previously described, where the amount of competitor binding will be inversely related to the amount of antibody present. The concentration of competitor molecule will be from about 10 times the maximum anticipated antibody concentration to about equal concentration in order to obtain the most sensitive and linear range of detection.

Alternatively, antibodies may be used for direct determination of the presence of the deletion marker polypeptide. Antibodies specific for the subject deletion markers as previously described may be used in screening immunoassays. Samples, as used herein, include microbial cultures, biological fluids such as tracheal lavage, blood, etc. Also included in the term are derivatives and fractions of such fluids. Diagnosis may be performed by a number of methods. The different methods all determine the absence or presence of polypeptides encoded by the subject deletion markers. For example, detection may utilize staining of mycobacterial cells or histological sections, performed in accordance with conventional methods. The antibodies of interest are added to the cell sample, and incubated for a period of time sufficient to allow binding to the epitope, usually at least about 10 minutes. The antibody may be labeled with radioisotopes, enzymes, fluorescers, chemiluminescers, or other labels for direct detection. Alternatively, a second stage antibody or reagent is used to amplify the signal. Such reagents are well known in the art. For example, the primary antibody may be conjugated to biotin, with horseradish peroxidase-conjugated avidin added as a second stage reagent. Final detection uses a substrate that undergoes a color change in the presence of the peroxidase. The absence or presence of antibody binding may be determined by various methods, including microscopy, radiography, scintillation counting, etc.

An alternative method for diagnosis depends on the in vitro detection of binding between antibodies and the subject polypeptides in solution, e.g. a cell lysate. Measuring the concentration of binding in a sample or fraction thereof may be accomplished by a variety of specific assays. A conventional sandwich type assay may be used. For example, a sandwich assay may first attach specific antibodies to an insoluble surface or support. The particular manner of binding is not crucial so long as it is compatible with the reagents and overall methods of the invention. The antibodies may be bound to the plates covalently or non-covalently, preferably non-covalently. The insoluble supports may be any compositions to which polypeptides can be bound, which are readily separated from soluble material, and which are otherwise compatible with the overall method. The surface of such supports may be solid or porous and of any convenient shape. Examples of suitable insoluble supports to which the receptor is bound include beads, e.g. magnetic beads, membranes and microtiter plates. These are typically made of glass, plastic (e.g. polystyrene), polysaccharides, nylon or nitrocellulose. Microtiter plates are especially convenient because a large number of assays can be carried out simultaneously, using small amounts of reagents and samples.

Samples are then added to separately assayable supports (for example, separate wells of a microtiter plate) containing antibodies. Preferably, a series of standards, containing known concentrations of the polypeptides, is assayed in parallel with the samples or aliquots thereof to serve as controls. Preferably, each sample and standard will be added to multiple wells so that mean values can be obtained for each. The incubation time should be sufficient for binding; generally, from about 0.1 to 3 hr is sufficient. After incubation, the insoluble support is generally washed of non-bound components. Generally, a dilute non-ionic detergent medium at an appropriate pH, generally 7-8, is used as a wash medium. From one to six washes may be employed, with sufficient volume to thoroughly wash non-specifically bound proteins present in the sample.

After washing, a solution containing a second antibody is applied. The antibody will bind with sufficient specificity such that it can be distinguished from other components present. The second antibodies may be labeled to facilitate direct or indirect quantification of binding. Examples of labels that permit direct measurement of second receptor binding include radiolabels, such as ³H or ¹²⁵I, fluorescers, dyes, beads, chemiluminescers, colloidal particles, and the like. Examples of labels which permit indirect measurement of binding include enzymes where the substrate may provide for a colored or fluorescent product. In a preferred embodiment, the antibodies are labeled with a covalently bound enzyme capable of providing a detectable product signal after addition of suitable substrate. Examples of suitable enzymes for use in conjugates include horseradish peroxidase, alkaline phosphatase, malate dehydrogenase and the like. Where not commercially available, such antibody-enzyme conjugates are readily produced by techniques known to those skilled in the art. The incubation time should be sufficient for the labeled ligand to bind available molecules. Generally, from about 0.1 to 3 hr is sufficient, usually 1 hr sufficing.

After the second binding step, the insoluble support is again washed free of non-specifically bound material. The signal produced by the bound conjugate is detected by conventional means. Where an enzyme conjugate is used, an appropriate enzyme substrate is provided so a detectable product is formed.
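Where standards of known concentration are assayed in parallel in replicate wells, the detected signal can be converted to a concentration by interpolation. The following is a minimal sketch, assuming that signal increases with concentration across the standards and that piecewise-linear interpolation between neighboring standards is acceptable; the function names and the example readings are hypothetical, and other curve-fitting models are often preferred in practice.

```python
from bisect import bisect_left
from statistics import mean


def mean_signal(replicate_wells: list[float]) -> float:
    """Mean signal across the replicate wells of one sample or standard."""
    return mean(replicate_wells)


def interpolate_concentration(standards: list[tuple[float, float]], signal: float) -> float:
    """Piecewise-linear interpolation of concentration from (concentration, signal) standards."""
    points = sorted(standards, key=lambda p: p[1])  # order the standards by signal
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the range covered by the standards")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (c0, s0), (c1, s1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: (concentration, mean signal of replicate wells).
standards = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05)]
sample = mean_signal([0.60, 0.70])          # two replicate wells
estimate = interpolate_concentration(standards, sample)
```

Signals outside the range of the standards raise an error rather than being extrapolated, which would prompt re-assay at a different dilution.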

Other immunoassays are known in the art and may find use as diagnostics. Ouchterlony plates provide a simple determination of antibody binding. Western blots may be performed on protein gels or protein spots on filters, using a detection system specific for the polypeptide, conveniently using a labeling method as described for the sandwich assay.

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the subject invention, and are not intended to limit the scope of what is regarded as the invention. Efforts have been made to ensure accuracy with respect to the numbers used (e.g. amounts, temperature, concentrations, etc.) but some experimental errors and deviations should be allowed for. Unless otherwise indicated, parts are parts by weight, molecular weight is average molecular weight, temperature is in degrees centigrade; and pressure is at or near atmospheric.

Experimental Methods:

The technical methods used begin with extraction of whole genomic DNA from bacteria grown in culture.

Day 1

Inoculate culture medium of choice (LJ/7H9) and incubate at 35° C. until abundant growth. Dispense 500 μl 1× TE into each tube. (If cells are taken from liquid medium, no TE is needed.) Transfer a loopful (sediment) of cells into a microcentrifuge tube containing 500 μl of 1× TE. If taking DNA from liquid medium, let cells collect in the bottom of the flask. Pipette cells (about 1 ml) into tube. Heat 20 min at 80° C. to kill cells, centrifuge, resuspend in 500 μl of 1× TE. Add 50 μl of 10 mg/ml lysozyme, vortex, incubate overnight at 37° C.

Day 2

Add 70 μl of 10% SDS and 10 μl proteinase K, vortex and incubate 20 min at 65° C. Add 100 μl of 5M NaCl. Add 100 μl of CTAB/NaCl solution, prewarmed at 65° C. Vortex until the liquid content is white (“milky”). Incubate 10 min at 65° C. Outside of the hood, prepare new microcentrifuge tubes labeled with culture # on top, and culture #, tube #, date on side. Add 550 μl isopropanol to each and cap. Back in the hood, add 750 μl of chloroform/isoamyl alcohol, vortex for 10 sec. Centrifuge at room temp for 5 min at 12,000 g. Transfer aqueous supernatant in 180 μl amounts to new tube using pipetter, being careful to leave behind solids and non-aqueous liquid. Place 30 min at −20° C. Spin 15 min at room temp in a microcentrifuge at 12,000 g. Discard supernatant; leave about 20 μl above the pellet. Add 1 ml cold 70% ethanol and turn tube upside down a few times. Spin 5 min at room temp in a microcentrifuge. Discard supernatant; leave about 20 μl above the pellet. Spin 1 min in a microcentrifuge and cautiously discard the last 20 μl of supernatant just above the pellet using a pipetter (P-20). Be sure that all traces of ethanol are removed. Allow pellet to dry at room temp for 10 min or speed vac 2-3 min. (Place open tubes in speed vac, close lid, start rotor, turn on vacuum. After 3 min push red button, turn off vacuum, turn off rotor. Check if pellets are dry by flicking tube to see if pellet comes away from side of tube.) Redissolve the pellet in 20-50 μl of ddH2O. Small pellets get 20 μl, regular-sized pellets 30 μl and very large pellets 50 μl. DNA can be stored at 4° C. for further use.

DNA array. The array was made by spotting DNA fragments onto glass microscope slides which were pretreated with poly-L-lysine. Spotting onto the array was accomplished by a robotic arrayer. The DNA was cross-linked to the glass by ultraviolet irradiation, and the free poly-L-lysine groups were blocked by treatment with 0.05% succinic anhydride, 50% 1-methyl-2-pyrrolidinone and 50% borate buffer.

The majority of spots on the array were PCR-derived products, produced by selecting over 9000 primer pairs designed to amplify the predicted open reading frames of the sequenced strain H37Rv. Some internal standards and negative control spots, including plasmid vectors and non-M. tb. DNA, were also on the array.

Having prepared an array that contained the whole genome of Mycobacterium tuberculosis, we compared BCG-Connaught to Mycobacterium tuberculosis, using the array for competitive hybridization. The protocol follows:

DNA labeling protocol. Add 4 μg DNA in 20 μl H₂O, 2 μl dN10N6 and 36 μl H₂O. Add 2 μl DNA spike for each DNA sample, for a total of 60 μl. Boil 3 minutes to denature DNA, then snap cool in an ice water bath. Add 1 μl dNTP (5 mM ACG), 10 μl 10× buffer, 4 μl Klenow, 22 μl H₂O to each tube. Add 3 μl of Cy3 or Cy5 dUTP, for a total of 100 μl. Incubate 3 hours at 37° C. Add 11 μl 3M NaAc, 250 μl 100% EtOH to precipitate, store overnight at −20° C. Centrifuge genomic samples 30 minutes at 13K to pellet precipitate. Discard supernatant, add 70% EtOH, spin 15 minutes, discard supernatant and speed-vac to dry. This provides DNA for two experiments.
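The stated totals of the labeling reaction can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the dN10N6 and DNA spike volumes are 2 μl each, the only reading under which the stated totals of 60 μl and 100 μl hold, and the dictionary names are hypothetical.

```python
# Volumes in microlitres, taken from the labeling protocol.
# Assumption: dN10N6 and DNA spike are 2 ul each, so that the stated totals hold.
denaturation_mix = {
    "genomic DNA in H2O": 20,
    "dN10N6": 2,
    "H2O": 36,
    "DNA spike": 2,
}
labeling_additions = {
    "dNTP": 1,
    "buffer": 10,
    "Klenow": 4,
    "H2O": 22,
    "Cy3 or Cy5 dUTP": 3,
}


def total_volume(components):
    """Sum the component volumes (ul) of one mix."""
    return sum(components.values())


denaturation_total = total_volume(denaturation_mix)
final_total = denaturation_total + total_volume(labeling_additions)

print(f"volume before boiling: {denaturation_total} ul")
print(f"final labeling reaction: {final_total} ul")
```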

DNA hybridization to microarray protocol. Resuspend the labeled DNA in 11 μl dH₂O (for 2 arrays). Run out 1 μl DNA on a 1.5% agarose gel to document the sample to be hybridized. Of the remaining 10 μl of solution, half will be used for this hybridization, and half will be left for a later date. Take 5 μl of Cy3 solution and add to the same amount of Cy5 solution, for a total volume of 10 μl mixed labeled DNA. Add 1 μl tRNA, 2.75 μl 20×SSC, 0.4 μl SDS, for a total volume of about 14.1 μl. Place on slide at array site, cover with 22 mm coverslip, put slide glass over and squeeze onto rubber devices, then hybridize 4 hours at 65° C. After 4 hours, remove array slides from devices, leave coverslip on, and dip in slide tray into wash buffer consisting of 1×SSC with 0.05% SDS for about 2 minutes. The coverslip should fall off into the bath. After 2 minutes in wash buffer, dip once into a bath with 0.06×SSC, then rinse again in 0.06×SSC in a separate bath. Dry slides in centrifuge at about 600 rpm. They are now ready for scanning.

Fluorescence scanning and data acquisition. Fluorescence scanning was set for 20 microns/pixel and two readings were taken per pixel. Data for channel 1 was set to collect fluorescence from Cy3 with excitation at 520 nm and emission at 550-600 nm. Channel 2 collected signals excited at 647 nm and emitted at 660-705 nm, appropriate for Cy5. No neutral density filters were applied to the signal from either channel, and the photomultiplier tube gain was set to 5. Fine adjustments were then made to the photomultiplier gain so that signals collected from the two spots containing genomic DNA were equivalent.

To analyze the signal from each spot on the array, a 14×14 grid of boxes was applied to the data collected from the array such that signals from within each box were integrated and a value was assigned to the corresponding spot. A background value was obtained for each spot by integrating the signals measured 2 pixels outside the perimeter of the corresponding box. The signal and background values for each spot were imported into a spreadsheet program for further analysis. The background values were subtracted from the signals and a factor of 1.025 was applied to each value in channel 2 to normalize the data with respect to the signals from the genomic DNA spots.
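The background subtraction and channel 2 normalization described above may be sketched as follows. This is a minimal illustration, not the spreadsheet analysis actually used; the record layout and field names (`ch1_signal`, `ch1_background`, and so on) are hypothetical, and only the factor of 1.025 is taken from the text.

```python
# Normalization factor for channel 2, as stated in the text.
CHANNEL2_FACTOR = 1.025


def corrected_signals(spots, channel2_factor=CHANNEL2_FACTOR):
    """Subtract each spot's local background, then scale channel 2.

    `spots` is a list of dicts with hypothetical keys: name,
    ch1_signal, ch1_background, ch2_signal, ch2_background.
    """
    results = []
    for spot in spots:
        ch1 = spot["ch1_signal"] - spot["ch1_background"]
        ch2 = (spot["ch2_signal"] - spot["ch2_background"]) * channel2_factor
        results.append({"name": spot["name"], "ch1": ch1, "ch2": ch2})
    return results


# Made-up integrated values for a single spot, for illustration only.
example = [
    {
        "name": "spot_A",
        "ch1_signal": 1200.0,
        "ch1_background": 200.0,
        "ch2_signal": 1000.0,
        "ch2_background": 200.0,
    }
]
for row in corrected_signals(example):
    print(row)
```

The corrected channel values can then be compared spot by spot, for example as a ratio of channel 2 to channel 1.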

Because the two samples are labeled with different fluorescent dyes, it is possible to determine that a spot of DNA on the array has hybridized to Mycobacterium tuberculosis (green dye) and not to BCG (red dye), thus demonstrating a likely deletion from the BCG genome.

However, because the array now contains approximately 4000 spots, one may expect up to 100 spots with hybridization two standard deviations above or below the mean. Consequently, we have devised a screening protocol, where we look for mismatched hybridization in two consecutive genes on the genome. Therefore, we are essentially looking only for deletions of multiple genes at this point.
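The screening rule, requiring mismatched hybridization in two consecutive genes, can be illustrated with a short sketch. It assumes the spots are ordered by genome position and that each spot carries a log ratio of BCG signal to M. tuberculosis signal, so that a deleted gene gives a strongly negative value. The one-sided cutoff of two standard deviations below the mean and the function names are assumptions made for illustration.

```python
import statistics


def flag_low_outliers(log_ratios, n_sd=2.0):
    """Flag spots whose log ratio lies more than n_sd SDs below the mean."""
    mean = statistics.mean(log_ratios)
    sd = statistics.pstdev(log_ratios)
    cutoff = mean - n_sd * sd
    return [value < cutoff for value in log_ratios]


def consecutive_runs(flags, min_run=2):
    """Return (first_index, last_index) for each run of at least
    min_run consecutive flagged spots."""
    runs = []
    start = None
    for index, flagged in enumerate(flags):
        if flagged and start is None:
            start = index
        elif not flagged and start is not None:
            if index - start >= min_run:
                runs.append((start, index - 1))
            start = None
    if start is not None and len(flags) - start >= min_run:
        runs.append((start, len(flags) - 1))
    return runs


# Made-up data: 30 genes in genome order, two adjacent low spots
# (indices 5 and 6) and one isolated low spot (index 20).
log_ratios = [0.0] * 30
for i in (5, 6, 20):
    log_ratios[i] = -3.0

flags = flag_low_outliers(log_ratios)
candidates = consecutive_runs(flags)
print(candidates)
```

In this made-up data set the isolated low spot is discarded and only the adjacent pair is retained as a candidate deletion, which is the intended effect of the rule.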

To confirm that a gene or group of genes is deleted, we perform Southern hybridization, employing a separate probe from the DNA on the array. Digestions of different mycobacterium DNAs are run on an agarose gel, and transferred to membranes. The membranes can be repeatedly used for probing for different DNA sequences. For the purposes of this project, we include DNA from the reference strain of Mycobacterium tuberculosis (H37Rv), from other laboratory strains, such as H37Ra, the O strain, from clinical isolates, from the reference strain of Mycobacterium bovis, and from different strains of Mycobacterium bovis BCG.

Once a deletion is confirmed by Southern hybridization, we then set out to characterize the exact genomic location. This is done by using polymerase chain reaction, with primers designed to be close to the edges of the deletion; see Talbot (1997) J Clin Micro. 35: 566-9.

Primers have been chosen to amplify across the deleted region. Only in the absence of this region does one obtain an amplicon. PCR products were examined by electrophoresis (1.5% agarose) and ethidium bromide staining.
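The logic of amplifying across the deleted region can be sketched numerically. The example assumes that an amplicon is obtained only when the distance between the primers is short enough to be amplified; the limit of 3000 bp, the coordinates and the function names are hypothetical values chosen for illustration.

```python
def expected_product_length(forward_start, reverse_end, deletion=None):
    """Length (bp) of the product between two primer positions given in
    reference coordinates, optionally with a deletion (start, end)
    removed from the template."""
    length = reverse_end - forward_start
    if deletion is not None:
        deletion_start, deletion_end = deletion
        length -= deletion_end - deletion_start
    return length


def amplifies(length, max_amplifiable=3000):
    """Assumed rule: only products up to max_amplifiable bp are obtained."""
    return 0 < length <= max_amplifiable


# Hypothetical primers flanking a hypothetical 10 kb deletion.
intact_length = expected_product_length(1000, 11500)
deleted_length = expected_product_length(1000, 11500, deletion=(1200, 11200))

print("intact template:", intact_length, "bp, amplicon:", amplifies(intact_length))
print("deleted template:", deleted_length, "bp, amplicon:", amplifies(deleted_length))
```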

Once a short amplicon is obtained, this amplicon is then sequenced. A search of the genome database is performed to determine whether one part of the sequence is exactly identical to one part of the Mycobacterium tuberculosis genome, and whether the next part of the amplicon is exactly identical to another part of the Mycobacterium tuberculosis genome. This permits precise identification of the site of deletion.
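The junction search can be illustrated with exact string matching. This is a simplified sketch under stated assumptions: in practice a database search tool would be used, whereas here the reference is a short made-up sequence and `locate_deletion` is a hypothetical helper that splits the amplicon into a left part and a right part, each matching the reference exactly.

```python
def locate_deletion(reference, amplicon, min_flank=5):
    """Find a split of the amplicon whose left part and right part both
    match the reference exactly, with a gap between the two matches.

    Returns (deletion_start, deletion_end) as 0-based, half-open
    reference coordinates, or None if no such split is found.
    """
    # Try the longest left part first.
    for split in range(len(amplicon) - min_flank, min_flank - 1, -1):
        left, right = amplicon[:split], amplicon[split:]
        left_pos = reference.find(left)
        if left_pos == -1:
            continue
        left_end = left_pos + split
        right_pos = reference.find(right, left_end)
        if right_pos > left_end:
            return left_end, right_pos
    return None


# Made-up sequences for illustration only.
left_flank = "ATGGCCATTGTA"
deleted = "CCCCGGGGAAAATTTT"
right_flank = "TGCAGTCAGCTA"
reference = "GG" + left_flank + deleted + right_flank + "CC"
amplicon = left_flank + right_flank

site = locate_deletion(reference, amplicon)
print(site)
if site is not None:
    print(reference[site[0]:site[1]])
```

Where the sequence at the junction is repeated at both edges of the deletion, more than one split is consistent with the data; this sketch reports the one with the longest left part.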

Below follows an example of the kind of report obtained:

This process is repeated with each suggested deletion, beginning with the three previously described deletions to serve as controls. Sixteen deletions have been identified by these methods, and are listed in Table 1.

It is to be understood that this invention is not limited to the particular methodology, protocols, formulations and reagents described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.

It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a complex” includes a plurality of such complexes and reference to “the formulation” includes reference to one or more formulations and equivalents thereof known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this invention belongs. Although any methods, devices and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods, devices and materials are now described.

All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing, for example, the cell lines, constructs, and methodologies that are described in the publications which might be used in connection with the presently described invention. The publications discussed above and throughout the text are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior invention.

1. An immunogenic composition, comprising: a substantially pure polypeptide encoded by a nucleotide sequence comprising the open reading frame Rv2653, SEQ ID NO:93 or a polypeptide encoded by a nucleotide fragment of at least 25 contiguous nucleotides of SEQ ID NO:93 or where said polypeptide is fused to another peptide or protein; and a pharmaceutically acceptable excipient.
2. The immunogenic composition according to claim 1, further comprising an adjuvant.
3. The immunogenic composition according to claim 1, wherein said polypeptide is fused to another peptide or protein.
4. The immunogenic composition according to claim 1, comprising a mycobacterium of the M. tuberculosis complex that has been modified by introduction of said nucleotide sequence comprising the open reading frame Rv2653 (SEQ ID NO:93) or nucleotide fragment of at least 25 contiguous nucleotides of SEQ ID NO:93.
5. The immunogenic composition according to claim 4, wherein said mycobacterium of the M. tuberculosis complex is bacillus Calmette-Guerin.
6. The immunogenic composition according to claim 4, wherein said mycobacterium of the M. tuberculosis complex is M. bovis.
7. The immunogenic composition according to claim 1, wherein said polypeptide is co-formulated with a mycobacterium of the M. tuberculosis complex.
8. The immunogenic composition of claim 7, wherein said mycobacterium of the M. tuberculosis complex is bacillus Calmette-Guerin.
9. The immunogenic composition according to claim 7, wherein said mycobacterium of the M. tuberculosis complex is M. bovis.
10. A method of immunizing an individual to M. tuberculosis, the method comprising: injecting said individual with a mycobacterium of the M. tuberculosis complex that has been modified to introduce a nucleotide sequence comprising the open reading frame Rv2653, SEQ ID NO:93 or a polypeptide encoded by a nucleotide fragment of at least 25 contiguous nucleotides of SEQ ID NO:9, wherein said mycobacterium of the M. tuberculosis complex is bacillus Calmette-Guerin.
11. A method of immunizing an individual to M. tuberculosis, the method comprising: injecting said individual with a polypeptide encoded by a nucleotide sequence comprising the open reading frame Rv2653, SEQ ID NO:93 or a polypeptide encoded by a nucleotide fragment of at least 25 contiguous nucleotides of SEQ ID NO:9 or where said polypeptide is fused to another peptide or protein, wherein said polypeptide is co-formulated with bacillus Calmette-Guerin.
12. A genetically altered mycobacterium of the M. tuberculosis complex, comprising an exogenous nucleic acid sequence comprising the open reading frame Rv2653, SEQ ID NO:93 or a polypeptide encoded by a nucleotide fragment of at least 25 contiguous nucleotides of SEQ ID NO:9.
13. The genetically altered mycobacterium of claim 12, wherein said exogenous nucleic acid encodes a polypeptide that is fused to another peptide or protein.
14. The genetically altered mycobacterium of claim 12, wherein said mycobacterium is BCG.
15. The mycobacterium of claim 12, and a physiologically acceptable carrier for injection.